-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter out Hive information_schema and sys #3008
Conversation
import static io.prestosql.tests.utils.QueryExecutors.onPresto; | ||
import static java.util.Objects.requireNonNull; | ||
|
||
public class TestHiveSchema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dear Reviewer,
I know that the TestHiveSchema
is awfully long and looks like 90% redundant, but it is actually not.
return ImmutableList.of(schemaName.get()); | ||
} | ||
return listSchemaNames(session); | ||
} | ||
|
||
private static boolean filterSchema(String schemaName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a subjective comment, but to me if (!filterSchema(tableName.getSchemaName()))
isn't very intuitive. Perhaps if the we'd call the method something like isSystemSchema()
(of course, the return values would be reversed), those tests would read better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems filter
is overloaded, and doesn't really mean whether it's positive (filter) or negative (filter out, exclude).
the reason i picked this name is -- java.util.stream.Stream#filter
, io.prestosql.security.AccessControl#filterSchemas
...
hmm maybe isExcludedSchema
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am keeping filterSchema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isHiveMetadataSchema
?
filterSchema
sounds like *AccessControl#filterSchemas
. I don't like this name here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the problem is with semntics of the method
While it's true today, in general: metadata schema != we should hide it
so isHiveMetadataSchema
isn't IMO btter
presto-product-tests/src/main/java/io/prestosql/tests/hive/TestHiveSchema.java
Outdated
Show resolved
Hide resolved
presto-product-tests/src/main/java/io/prestosql/tests/hive/TestHiveSchema.java
Show resolved
Hide resolved
`information_schema` will be inaccessible anyway. `sys` could be accessible, but - it doesn't work (contains JdbcStorageHandler tables and Hive views) - exposing it may require proper handling in access control.
return ImmutableList.of(schemaName.get()); | ||
} | ||
return listSchemaNames(session); | ||
} | ||
|
||
private static boolean filterSchema(String schemaName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isHiveMetadataSchema
?
filterSchema
sounds like *AccessControl#filterSchemas
. I don't like this name here.
} | ||
if ("sys".equals(schemaName)) { | ||
// Hive 3's `sys` schema contains no objects we can handle, so there is no point in exposing it. | ||
// Also, exposing it may require proper handling in access control. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by: exposing it may require proper handling in access control.
? What is sys
schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sys contains table paramters, partition paramters, some stats, etc.
return ImmutableList.of(schemaName.get()); | ||
} | ||
return listSchemaNames(session); | ||
} | ||
|
||
private static boolean filterSchema(String schemaName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this method under the last usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right
public void setUp() | ||
{ | ||
// make sure hive.default schema is not empty | ||
onPresto().executeQuery("DROP TABLE IF EXISTS hive.default.test_sys_schema_disabled_table_in_default"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is going to be called before each test. Maybe you could have a variable that would not create this table if it is already created, then tearDown
could be removed. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in general -- i agree
it's tiny overhead compared to the number of test queries. i choose to ignore this
import static io.prestosql.tests.utils.QueryExecutors.onPresto; | ||
import static java.util.Objects.requireNonNull; | ||
|
||
public class TestHiveSchema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a pity that this is a product test. After #3009 we could write "unit" test, but still it is not going to be much faster, but it would be easier to run tests from IDE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a necessity that there is a product test. i want to run with HMS whether information_schema/sys exists and where it doesn't
it would be good to have a unit test too though. out of scope
.satisfies(containsFirstColumnValue("information_schema")); | ||
assertThat(onPresto().executeQuery("SELECT table_name FROM information_schema.tables WHERE table_schema = 'information_schema'")) | ||
.containsOnly(allInformationSchemaTablesAsRows); | ||
Assertions.assertThat(onPresto().executeQuery("SELECT table_schema, table_name FROM information_schema.tables").rows().stream() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static import
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static import
assertThat
?
I use tempto QueryAssert.assertThat
and assertj Assertions.assertThat
-- it happens; not pretty; we already do this in other tests
presto-product-tests/src/main/java/io/prestosql/tests/hive/TestHiveSchema.java
Show resolved
Hide resolved
AC |
information_schema
will be inaccessible anyway.sys
could be accessible, but